SHP-1 tyrosine phosphatase binding to c-Src kinase phospho-dependent conformations: A comparative structural framework

SHP-1 is a cytosolic tyrosine phosphatase that is primarily expressed in hematopoietic cells. It acts as a negative regulator of numerous signaling pathways and controls multiple cellular functions involved in cancer pathogenesis. This study describes the binding preferences of SHP-1 (pY536) for c-Src open (pY416) and c-Src close (pY527) through in silico approaches. Molecular dynamics simulation analysis revealed more conformational changes in c-Src close upon binding to SHP-1 than in the active/open conformation, which is stabilized by the cooperative binding of the C-SH2 domain and C-terminal tail of SHP-1 to the c-Src SH2 domain and kinase domain (KD). In contrast, the interaction between c-Src close and SHP-1 is mediated by binding of the PTP domain-specific WPD-loop (WPDXGXP) and Q-loop (QTXXQYXF) to c-Src close C-terminal tail residues. The dynamic correlation analysis demonstrated a positive correlation of the SHP-1 PTP domain with the KD, SH3 domain, and C-terminal tail of c-Src close. In the c-Src open-SHP-1 complex, the SH3 and SH2 domains of c-Src open were correlated with the C-SH2 domain and the C-terminal tail of SHP-1. Our findings reveal that SHP-1-dependent c-Src activation through dephosphorylation relies on a conformational shift in the inhibitory C-terminal tail that may ease the recruitment of the N-SH2 domain to the phosphotyrosine residue, thereby relieving the PTP domain from inhibition. Collectively, this study delineates the intermolecular interaction paradigm and the underlying conformational readjustments in SHP-1 upon binding to the active and inactive states of c-Src. These findings may help in devising novel therapeutic strategies for targeting cancer development.


Introduction
Protein tyrosine phosphorylation is a post-translational modification that plays an essential role in cell growth, proliferation, and differentiation [1,2]. Tyrosine phosphorylation is a fundamental mechanism in eukaryotic cellular signaling pathways [3]. The level of protein tyrosine phosphorylation is rigorously controlled by two types of enzymes that exert opposite biological functions [4,5]: protein-tyrosine phosphatases (PTPs) and protein-tyrosine kinases (PTKs). PTPs counterbalance PTK-mediated phosphorylation by dephosphorylating the phosphorylated tyrosine [5][6][7]. The structures and functions of PTKs have been widely

Molecular docking analysis
Receptor-ligand interactions play significant roles in various biological processes, and knowledge of molecular associations may help in understanding numerous cellular pathways [46]. In this study, minimized SHP-1 was docked against c-Src open and c-Src close using ClusPro [47], PatchDock [46], and its embedded refinement tool FireDock [48] to describe their interaction patterns. PatchDock performs docking through a segmentation algorithm based on structure geometry. It searches for docking transformations that yield good molecular shape complementarity, characterized by a small number of steric clashes and wide interface areas. The PatchDock algorithm divides the Connolly dot surface representation of the protein molecules into concave, convex, and flat patches [49]. Complementary patches are matched to generate the candidate transformations. A scoring function that considers both the atomic desolvation energy and the geometric fit [50] then evaluates each candidate transformation. Finally, redundant solutions are discarded by RMSD (Root Mean Square Deviation) clustering and the most suitable candidate solution is selected. Overall, the PatchDock analysis follows three major steps: (i) molecular shape representation, (ii) surface patch matching, and (iii) filtering and scoring [51]. To confirm our docking results, SHP-1 binding to c-Src open and c-Src close was also evaluated with ClusPro [52]. The ClusPro server (https://cluspro.org) is a widely used protein-protein docking tool that performs docking in three steps: (1) rigid body docking, (2) RMSD-based clustering of the 1000 lowest energy structures, and (3) removal of steric clashes by energy minimization.
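The RMSD-based clustering step mentioned above can be illustrated with a minimal sketch. It assumes that the docked ligand poses are already expressed in a common receptor frame, so no superposition is needed, and it uses a simple greedy rule (repeatedly take the pose with the most neighbours inside a radius). The function names, the radius, and the greedy rule are illustrative assumptions and are not the exact procedures implemented in PatchDock or ClusPro.

```python
import numpy as np

def pairwise_rmsd(poses):
    """poses: array (n_poses, n_atoms, 3) of ligand coordinates in a shared receptor frame."""
    diff = poses[:, None, :, :] - poses[None, :, :, :]
    # Mean squared atomic distance between every pair of poses, then square root.
    return np.sqrt((diff ** 2).sum(axis=-1).mean(axis=-1))

def greedy_cluster(poses, radius):
    """Greedy clustering: the pose with the most unassigned neighbours becomes a cluster centre."""
    rmsd = pairwise_rmsd(poses)
    unassigned = np.ones(len(poses), dtype=bool)
    clusters = []
    while unassigned.any():
        # Neighbour relation restricted to poses that are still unassigned.
        neighbours = (rmsd <= radius) & unassigned[None, :] & unassigned[:, None]
        centre = int(neighbours.sum(axis=1).argmax())
        members = np.flatnonzero(neighbours[centre])
        clusters.append((centre, members.tolist()))
        unassigned[members] = False
    return clusters
```

The radius is given in the same units as the coordinates. Larger clusters correspond to more densely populated regions of the docking landscape, which is the rationale for ranking poses by cluster size rather than by score alone.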

Molecular dynamics simulation assay
To gain further insight into dynamic behavior, conformational changes, and interaction stability, molecular dynamics (MD) simulations were performed for 300 ns. We used the CHARMM-GUI input generator [40] to build the systems for the MD runs. All MD simulations were carried out with GROMACS 5.1.6 [53] using the CHARMM36 force field [40] and the TIP3P water model [54]. All systems were initially centered in cubic periodic boxes with the following dimensions: SHP-1, 12 x 12 x 12 nm; c-Src open, 12.5 x 12.5 x 12.5 nm; c-Src close, 9.7 x 9.7 x 9.7 nm; c-Src open-SHP-1 complex, 14.2 x 14.2 x 14.2 nm; and c-Src close-SHP-1 complex, 14.7 x 14.7 x 14.7 nm. The distance between the solute and the edge of the water box was 10-15 nm. The systems were neutralized by adding appropriate numbers of Na+ and Cl- counter-ions. MD simulations were run at constant pressure (1 atm) and temperature (303 K) using a Nose-Hoover thermostat [55]. The LINCS algorithm [56] was applied to constrain all bonds involving hydrogen atoms, with an integration time step of 2 fs. Long-range electrostatic interactions were computed by the fast, smooth Particle-Mesh Ewald (PME) summation [57] with a cut-off of 1 nm for the direct interactions. Each system underwent the same equilibration process: 1000 ps of MD simulation under NVT [58] conditions with a 1 fs integration time step, followed by an additional 1000 ps under NPT [59] conditions with an isotropic Parrinello-Rahman barostat [60,61] (2 ps time period). Finally, a 300 ns MD trajectory was obtained for each system. PDB files were generated at 25 ns intervals to assess system stability and structural changes. All MD trajectories were analyzed with UCSF Chimera 1.11 and GROMACS tools. The behavior and stability of each system were examined with the GROMACS modules g_rms, g_rmsf, g_hbond, and g_covar.
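As a quick consistency check on the protocol, the stated run lengths and time steps can be converted into integration step counts. The sketch below uses only values given in the text; the helper name is hypothetical, and it is an assumption that the 2 fs step applies to the 300 ns production run and that the snapshots are taken at 25, 50, ..., 300 ns (the initial frame is not counted).

```python
def n_steps(duration_ps, dt_fs):
    """Number of integration steps for a run of duration_ps picoseconds at a dt_fs femtosecond step."""
    # 1 ps = 1000 fs
    return round(duration_ps * 1000 / dt_fs)

# NVT equilibration: 1000 ps at 1 fs
nvt_steps = n_steps(1000, 1)

# Production: 300 ns = 300,000 ps, assumed to use the 2 fs step quoted with LINCS
prod_steps = n_steps(300_000, 2)

# Snapshot times in ns for the PDB files written every 25 ns (assumed to exclude t = 0)
snapshot_times_ns = list(range(25, 301, 25))

print(nvt_steps, prod_steps, len(snapshot_times_ns))
```

Under these assumptions the equilibration stage corresponds to one million steps, the production run to 150 million steps, and each system yields twelve exported snapshots.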

Dynamic cross-correlation analysis
In an MD simulation, atoms move under classical (Newtonian) mechanics, and their behavior is recorded as 3D coordinates over time. The dynamic cross-correlation method evaluates how the movement of each atom relates to the movements of the other atoms; dynamic correlation is therefore a measure of atomic movement, in a changing environment, relative to other atoms. The dynamic correlation analysis was implemented using the Python script calc_correlation.py from the MD-TASK suite (https://github.com/RUBi-ZA/MD-TASK). Input files were supplied as a trajectory (.xtc format) and a topology (.gro format). The Python library MDTraj (https://mdtraj.org/) reads the trajectory and topology files and combines them to obtain the atomic coordinates. Subsequently, NumPy (https://numpy.org/) was used to calculate the dynamic correlation values for the whole protein backbone under study. Vectorized NumPy operations make efficient use of the hardware, and the values are calculated with the following formula (Eq 1) [62]:

$$C_{ij} = \frac{\langle \Delta r_i \cdot \Delta r_j \rangle}{\sqrt{\langle \Delta r_i^2 \rangle}\,\sqrt{\langle \Delta r_j^2 \rangle}} \tag{1}$$

where $C_{ij}$ denotes the correlation between the $i$th and $j$th entities, $\Delta r_i$ and $\Delta r_j$ are the displacements of entities $i$ and $j$ from their mean positions, and the angle brackets denote averages over the trajectory [63]. The formula for each correlation value $r$ is given as Eq 2:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} \tag{2}$$

where $x$ and $y$ are the three-dimensional positions of the two respective atoms, from which the mean positions $\bar{x}$ and $\bar{y}$ are subtracted to obtain the correlation. The dynamic correlation has the same bounds as the canonical Pearson coefficient, +1 to -1, where +1 means perfect correlation, -1 means perfect anti-correlation, and 0 implies no correlation. After the dynamic correlation was calculated, an N x N matrix was written in .txt format, where N is the number of residues in the system. Subsequently, the correlation values were plotted as a heatmap with an in-house script, plot_regions.py, which extends the MD-TASK correlation script (https://github.com/RUBi-ZA/MD-TASK). The script takes the square matrix as input, along with the start and end of the protein region to be plotted as a heatmap. If no start and end are given, the script plots the whole protein and writes a .jpeg image. Input files are supplied in .txt format, and the output is generated with the heatmap functions of Matplotlib (https://matplotlib.org/) and seaborn (https://seaborn.pydata.org/).
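The correlation calculation can be sketched in a few lines of NumPy. This is a minimal illustration, not the calc_correlation.py script itself: it assumes the trajectory has already been superposed and reduced to one representative atom per residue, and that the coordinates are held in an array of shape (frames, atoms, 3). The function names and the 1-based region helper are hypothetical.

```python
import numpy as np

def dynamic_cross_correlation(coords):
    """coords: (n_frames, n_atoms, 3) aligned positions, one atom per residue (assumed)."""
    # Displacement of every atom from its mean position over the trajectory.
    disp = coords - coords.mean(axis=0)
    # <dr_i . dr_j>: dot products of displacements averaged over frames.
    cov = np.einsum("tik,tjk->ij", disp, disp) / coords.shape[0]
    # sqrt(<dr_i^2>) for each atom, taken from the diagonal.
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)

def region(matrix, start, end):
    """Square sub-matrix for residues start..end (1-based, inclusive)."""
    return matrix[start - 1:end, start - 1:end]
```

The result is a symmetric N x N matrix with ones on the diagonal and values between -1 and +1 elsewhere. An atom that never moves has zero variance and would produce a division by zero, so such atoms would need to be filtered out beforehand.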

Structural evaluation for c-Src open, c-Src close and SHP1
Human SHP-1 comprises two SH2 domains in the N-terminal region, followed by a catalytic domain (PTP domain) and a C-terminal tail (Fig 1A and 1B) [7,9]. Human c-Src comprises SH3 (84-145 aa) and SH2 (151-248 aa) domains, a linker region (249-269 aa), a kinase domain (270-523 aa), and a C-terminal tail (524-533 aa) (residue numbers correspond to chicken c-Src) (Fig 1C-1E) [65,66]. A comparison of the SHP-1 structure with the AlphaFold structure is included in S1 Fig; the RMSD value of 0.428 Å indicates that the two structures are nearly identical. Ramachandran scores for the c-Src open (98.00%), c-Src close (93.71%), and SHP-1 (94.44%) structures indicated that the large majority of residues lay in sterically allowed regions. The Ramachandran plot combines the four separate Ramachandran maps, using shapes to distinguish membership of a particular class (General, Proline, Glycine, and Pre-Proline). Within this scheme, glycines are shown as diamonds, prolines as triangles, and residues preceding prolines (pre-Proline) as rectangles; all other residues (General case) are drawn as small squares (S1 Table in

MD simulation analysis
To validate the findings of the MD simulation runs, we performed three replicas of 300 ns simulations for apo-SHP-1, apo c-Src open, apo c-Src close, and the complexes of SHP-1 with c-Src open and c-Src close to achieve coverage of the energetic and conformational space (S6 Fig). Subsequently, the average values were computed from these replicas and plotted. RMSD profile analysis indicated structural convergence for all systems (Fig 2A and 2B).
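For readers unfamiliar with how an RMSD profile is obtained, the following minimal sketch computes the RMSD of one frame against a reference after optimal rigid-body superposition (the Kabsch algorithm) and averages per-frame values across replicas. It is an illustration under stated assumptions and not the GROMACS implementation: all atoms are weighted equally (no mass weighting), and the function names are hypothetical.

```python
import numpy as np

def kabsch_rmsd(mobile, reference):
    """RMSD between two (n_atoms, 3) coordinate sets after optimal superposition."""
    # Remove translation by centring both structures.
    p = mobile - mobile.mean(axis=0)
    q = reference - reference.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    fitted = p @ rot.T
    return float(np.sqrt(((fitted - q) ** 2).sum(axis=1).mean()))

def average_replicas(rmsd_series):
    """Frame-wise mean of RMSD time series from several replicas of equal length."""
    return np.mean(np.asarray(rmsd_series, dtype=float), axis=0)
```

Applying `kabsch_rmsd` to every frame of a trajectory against the starting structure gives one RMSD time series per replica; `average_replicas` then produces the averaged profile of the kind that is plotted.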

Dynamic correlation and Circos analysis
To understand the movement of the proteins under study, we used the dynamic cross-correlation technique, which captures the concerted movement of atoms. Values approaching +1 indicate that the atoms are positively correlated, values approaching -1 indicate anti-correlation, and 0 means no correlation. These values were calculated from the highly dynamic environment of the molecular dynamics simulations.
Circos plots were used to visualize the domains with a cross-correlation value of 0.7 (70%) or higher. Correlation heatmaps provide the pairwise correlation landscape of the residues within the protein, whereas Circos plots were introduced to highlight inter-domain movement. In a Circos plot, the protein is represented by the outermost circle, which starts at the top with a grey box and runs clockwise along the sequence length. The outer circle represents the domains and the inner circle indicates the sequence; the lines that run from one domain to another represent correlations higher than or equal to 70%. Intra-domain correlations were omitted to avoid cluttering the plots, since only the inter-domain regions were taken into consideration. In addition, correlating the domains with one another allows the collective residue movement of a given domain to be evaluated.

In SHP-1, the N-SH2 and C-SH2 domains were highly correlated with the PTP domain. Interestingly, anti-correlation was observed between the C-terminal tail and the N-SH2 domain of SHP-1 (Fig 4A and S8A Fig). In c-Src open, inter-domain correlation was sparse. The SH2 domain was positively correlated with the SH3 domain and the linker region, and the Circos plot revealed a positive correlation among these domains. A more negative correlation was observed between the SH2 domain, the kinase domain, and the C-terminal tail. The SH3 domain also showed an anti-correlation pattern with the linker and the subsequent domain (Fig 4B and S8 Fig).

PLOS ONE
Structural analysis of SHP-1 dependent c-Src activation

In c-Src close, the SH2 domain was positively correlated with the SH3 domain, while the SH3 domain was positively correlated with the C-terminal tail. The C-terminal region was also positively correlated with the kinase domain. A negative correlation was observed between the kinase domain and the SH2 and SH3 domains (Fig 4C and S8 Fig). In the c-Src open-SHP-1 complex, the correlation heatmap and Circos plot demonstrated intense correlation across the whole landscape (Fig 4D and S8D Fig). Notable correlations were also seen in c-Src close coupled with SHP-1 (Fig 4E and S8E Fig). At the intramolecular level, the SH3, SH2, and linker regions of c-Src were positively correlated with the KD, while the linker region was correlated with the C-terminal region. In SHP-1, the PTP domain exhibited a positive correlation with the SHP-1 C-terminus. The intermolecular interaction analysis revealed positive correlations of the c-Src close SH3 domain, SH2 domain, and C-terminal tail with the SHP-1 PTP domain. Additionally, the c-Src close C-terminal tail was positively correlated with the SHP-1 N-SH2 domain and C-terminal tail (Fig 4E and S8E Fig).

Discussion
SHP-1 is a cytosolic tyrosine phosphatase that is primarily expressed in hematopoietic cells and regulates many cellular functions that control the flow of information from the cell membrane to the nucleus [11][12][13][14]. It acts as a negative regulator in numerous signaling pathways [67]. Recently, it has been reported that c-Src associates with the SH2 domain of SHP-1 in both platelets and lymphocytes [19] and phosphorylates the Y536 residue of the SHP-1 C-terminal tail, leading to enhanced phosphatase activity [19,22].
Subsequent dephosphorylation of c-Src pY527 by SHP-1 results in the activation of Src tyrosine kinase [21]. Although the role of SHP-1 in maintaining the overall phosphorylation status and activity of c-Src is well known, their binding characteristics and the key residues involved in regulating this activity remain largely unknown.
The current study characterizes the binding patterns of c-Src open (PTR416) and c-Src close (PTR527) with SHP-1 (PTR536) through in silico approaches. For c-Src open, MD simulation analysis revealed the participation of both the SH2 and SH3 domains in binding to the SHP-1 N-SH2 domain. In contrast, the c-Src close C-terminal tail played an active role in the interaction with the SHP-1 PTP domain. Intriguingly, more conformational readjustments were observed in c-Src open upon binding to SHP-1 (Fig 5). Association of the SHP-1-specific 33-RKNQG-37 motif (N-SH2) with the SH3 (D117) and SH2 (R160, A165, and E166) domains of c-Src open induces conformational changes in the SH2 domain that allow binding to the SHP-1 C-terminal tail (Fig 5). Consequently, R160 pairs with the N35 and Y536 residues of SHP-1 through hydrogen bonding and associates with the SHP-1 E535 side chain through a salt bridge (S3 Table in S1 File). These findings are in good agreement with previous studies in which the SH3 and SH2 domains cooperatively regulate c-Src open activity through a physical interaction [68]. Compared with apo-SHP-1, the N-SH2 domain moves in the direction opposite to the C-terminal tail upon binding to c-Src open. In apo-SHP-1, the C-tail covers the active site of the PTP domain; upon c-Src recognition, the N-SH2 domain pushes it aside, making the active site more accessible (Fig 5). Upon substrate binding, the N-SH2 domain moves away from the PTP active site, as reported for the phosphatase-active form of SHP-1/2 [7,9], whereas in the corresponding inactive form this domain keeps the SHP protein in an auto-inhibitory state by blocking the active site. Mechanistically, the reported SHP-1 and SHP-2 crystal structures lack C-tails, and the D'E loop of the SHP-1 N-SH2 domain enters the active site of the PTP domain and prevents substrate binding to the phosphatase [7,9].
In this study, through detailed binding analysis of active versus inactive c-Src to SHP-1 (including the C-terminal tail), we anticipate that cooperative binding of both SH2 domains and the C-terminal tail of SHP-1 to the c-Src SH2 domain and KD stabilizes the open conformation of c-Src. SHP-1-dependent c-Src activation through pY527 dephosphorylation [70] relies on a conformational change in the inhibitory C-terminal tail that eases the recruitment of the SH2 domain to the phosphotyrosine residue. As a result, the inhibition of the phosphatase domain by the N-SH2 domain of SHP-1 is lifted. Similar observations have been reported for Lyn-induced phosphorylation of the SHP-1 Y536 residue, which mediates a conformational change that recruits SHP-1 through its SH2 domains to the phosphotyrosine residue, leading to the activation of phosphatase inhibitory potential [71]. Another report suggests that SHP-2 (a close homolog of SHP-1) binds to c-Src and dephosphorylates it at Y527/Y530 [72]; however, a phosphatase-inactive mutant of SHP-2 is also capable of c-Src activation, suggesting that c-Src activity may not be regulated solely by dephosphorylation [73].
In the c-Src close-SHP-1 complex, significant contributions of the WPD-loop (WPDXGXP) and Q-loop (QTXXQYXF) of the SHP-1 PTP domain were observed in the transition from the closed to the open conformation. The binding of c-Src close C-terminal tail residues (T523, E524, Q526, pY527, and N532) to WPD-loop D419 and SHP-1 PTP domain residues (K232, K356, H420, V422, and S424) plays a critical role in promoting the dephosphorylation and activation of c-Src (S3 Table in S1 File). These findings are in good agreement with our dynamic correlation analysis, in which the c-Src close C-terminal tail (524-533 aa) is positively correlated with the SHP-1 PTP domain and the KD, and the movement of the SH2-SH3 region is highly correlated with that of the SHP-1 C-SH2 domain, suggesting their cooperative role in promoting the c-Src closed conformation (Fig 4). Collectively, our findings provide a comparative analysis of SHP-1 binding to both the active and inactive forms of c-Src and uncover the global conformational switches, driven by the individual movements of the regulatory domains, that govern the state transition of Src. Our study may expand the existing knowledge of SHP-1-dependent tyrosine dephosphorylation and add information on kinase activation that is relevant to curbing cancer development.